The demonstration and use of nonlocality, as defined by Bell’s theorem, rely strongly on dealing
with non-detection events due to losses and detector inefficiencies. Otherwise, the so-called detection
loophole could be exploited. The only way to avoid this is to have detection efficiencies that are above
a certain threshold. We introduce the intermediate assumption of limited detection efficiency, i.e. in
each run of the experiment the overall detection efficiency is lower bounded by $\eta_{\min} > 0$. Hence, in
an adversarial scenario, the adversaries have arbitrarily large but not full control over the inefficiencies.
We analyse the set of possible correlations that fulfil Limited Detection Locality (LDL) and show that
they necessarily satisfy some linear Bell-like inequalities. We prove that quantum theory predicts
violation of one of these inequalities for all $\eta_{\min} > 0$. Hence, nonlocality can be demonstrated with
arbitrarily small limited detection efficiencies. Finally, we propose a generalized scheme that uses
this characterization to deal with detection inefficiencies, which interpolates between the two usual
schemes, postselection and outcome assignment.


Introduction — When studying the discoveries in fundamental physics of the past century, one
cannot help but come across Bell’s seminal work [1] on the nonlocal nature of quantum theory.
It implies that quantum mechanics can produce correlations which cannot be explained by a
common past with local variables propagating contiguously. This has not only proven fascinating
from a foundational point of view, but has also given rise to applications in device independent
quantum information processing [2] (DIQIP), like quantum key distribution [3–5], randomness
generation [6, 7] or entanglement certification [8, 9].

Let us briefly recall the concept of local and nonlocal correlations. Assume that a source emits
particle pairs that travel to two distant labs, in which two experimenters, traditionally called
Alice and Bob, perform measurements on them (cf. Fig. 1). Alice locally performs one of several
possible measurements and records the outcome, as does Bob. We denote Alice’s and Bob’s
measurement choices by $X$ and $Y$ and their recorded outcomes by $A$ and $B$, respectively¹.
By doing so, they can compute the correlation $P_{AB|XY}$. Given the setup, it seems natural to
think that any correlations that Alice and Bob can observe in this way are due to the particles
having a common past, as they come from the same source. We refer to this common past by
$\lambda$. Correlations that can be explained by the existence of such a $\lambda$ are called local:

$$P_L(ab|xy) = \int d\lambda\, \rho(\lambda)\, P(a|x\lambda)\, P(b|y\lambda). \tag{1}$$

Bell’s work showed that there are quantum correlations that cannot be reproduced by such a
local model, proving that quantum mechanics is inherently nonlocal. This fact has since been
demonstrated in a multitude of experiments (see e.g. [2]) and found use in applications [3–9].


* Gilles.Puetz@unige.ch 

1 Notation: we use capital letters to denote random variables and 
lower case letters to denote the values these variables can take. 




FIG. 1. Two boxes are programmed by a hidden common
strategy $\lambda$. The boxes are given inputs $X$ and $Y$ and return
outputs $A$ and $B$. There is the possibility for nondetection
events, in which case the corresponding output variable takes
the value 0.


However, when demonstrating quantum nonlocality, several issues have to be dealt with. Here we
are specifically interested in one of them: what happens if the particles can be lost on the way
to or inside the labs, including the possibility that the particles reach the detectors but are
simply not registered by them? In this case we say that $A = 0$ or $B = 0$. One immediate idea
is of course to carefully analyse why the particles get lost and, if the mechanisms are well
understood, to simply discard these cases. This means that Alice and Bob postselect on the cases
in which they both registered a detection: $A \neq 0$ and $B \neq 0$. However, this opens up the
possibility that fully local correlations appear nonlocal if our understanding of the cause of the
non-detections is incorrect, a situation that we wish to avoid [10, 11]. This is especially
relevant in the case of active adversaries in DIQIP applications. Another option is to consider
the nondetection events as an additional possible outcome and simply check whether the resulting
correlation is nonlocal. In this case one will never mistake a local correlation for a nonlocal
one. The drawback, however, is that even highly nonlocal distributions may now appear local.
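The danger of naive postselection can be illustrated with a small sketch. The local model below is a generic illustration and is not taken from this work: the hidden variable guesses both inputs, and each box answers only when its guess was right. All names (`fake_pr_box`, `postselected_chsh`) and the outcome labelling (1, 2 for detections, 0 for a nondetection) are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def fake_pr_box():
    """Exact statistics of a local model that answers only when it guessed the input.

    Hidden variable lam = (gx, gy, r): guessed inputs gx, gy and a shared bit r,
    all uniformly distributed. Returns p[(a, b, x, y)] = P(ab|xy).
    """
    weight = Fraction(1, 8)  # lam is uniform over 2 * 2 * 2 values
    p = {}
    for x, y in product((0, 1), repeat=2):
        for gx, gy, r in product((0, 1), repeat=3):
            a = (r + 1) if x == gx else 0                # Alice stays silent on a wrong guess
            b = ((r ^ (gx & gy)) + 1) if y == gy else 0  # Bob likewise
            p[(a, b, x, y)] = p.get((a, b, x, y), 0) + weight
    return p


def postselected_chsh(p):
    """CHSH value computed after discarding every run with a nondetection."""
    total = Fraction(0)
    for x, y in product((0, 1), repeat=2):
        det = {(a, b): v for (a, b, xx, yy), v in p.items()
               if (xx, yy) == (x, y) and a != 0 and b != 0}
        eta_xy = sum(det.values())  # joint detection probability for this input pair
        corr = sum(v if a == b else -v for (a, b), v in det.items()) / eta_xy
        total += -corr if (x, y) == (1, 1) else corr
    return total


print(postselected_chsh(fake_pr_box()))  # 4, although the model is fully local
```

Each box detects with probability 1/2 on average, yet the postselected statistics reach the algebraic maximum of the CHSH expression, well above the local bound of 2. The model relies on the detection probability being exactly 0 for some hidden strategies.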

In the end, the only way to deal with this issue, usually called the detection loophole, consists
of not only producing highly nonlocal correlations but also having a high enough detection
efficiency. If the latter is not satisfied, even a perfect state preparation and perfectly
calibrated measurement apparatuses do not help and one is left with an inconclusive experiment
unless one assumes that the detection loophole is not exploited.

In this paper, we introduce the concept of limited detection locality. It consists of an
intermediate assumption between neglecting the detection loophole and closing it completely.
We show that this assumption, even when arbitrarily weak, allows one to demonstrate nonlocality
by postselection even with arbitrarily low overall detection efficiency. In addition, we show
that the two previously mentioned methods of dealing with detection inefficiencies
(postselection and assignment to an additional outcome) can be seen as special cases of a more
general method that we present below.

Limited Detection Locality (LDL) — We now introduce the assumption of limited detection
efficiency. Assume that there exist fixed $\eta_{\min}$ and $\eta_{\max}$ with
$[\eta_{\min}, \eta_{\max}] \subset [0, 1]$ such that

$$\eta_{\min} \leq P(a \neq 0|x\lambda) \leq \eta_{\max} \tag{2}$$

and similarly for Bob. This corresponds to the assumption that, for any input $x$ and any common
local variable $\lambda$, there is a probability of at least $\eta_{\min}$ and at most
$\eta_{\max}$ of having a detection. Consider for example a world in which the polarization
degree of freedom of photons was as yet undiscovered. It is nowadays well known that the
detection efficiency of almost all types of detectors is indeed sensitive to polarization.
However, the detection efficiency never goes up to 1 or down to 0, which corresponds to our
assumption of limited detection efficiency with nontrivial $\eta_{\min}$ and $\eta_{\max}$. We
refer to correlations fulfilling conditions (1) and (2) as limited detection local. Note that
technically the case of $[\eta_{\min}, \eta_{\max}] = [0, 1]$ can still be analysed by our
techniques and in fact we would recover the results of Branciard [10].

In an experiment, one can additionally determine the actual observed detection efficiencies,
which may of course be different for the different sets of inputs. It is reasonable to assume
that some detections occurred for all possible sets of inputs:

$$P(a \neq 0|x) = \eta^A_x > 0. \tag{3}$$

Since

$$P(a \neq 0|x) = \int d\lambda\, \rho(\lambda)\, P(a \neq 0|x\lambda), \tag{4}$$

we have that $\eta_{\min} \leq \eta^A_x \leq \eta_{\max}$. All of this holds analogously for
Bob’s side. To ease notation, we are going to define $\eta_{xy} = \eta^A_x \eta^B_y$.

We can now focus on the postselected limited detection local distributions given by

$$P(ab|xy, a \neq 0, b \neq 0) = \frac{P(ab|xy)}{\eta_{xy}}. \tag{5}$$
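As a minimal sketch of the postselection step, the function below renormalises a full distribution (including nondetections) on the events where both parties registered a detection. The array layout, the function name `postselect` and the loss model used in the usage lines are assumptions made for illustration.

```python
import numpy as np


def postselect(p):
    """Postselect P(ab|xy) on a != 0 and b != 0.

    p has shape (nA + 1, nB + 1, nX, nY); index 0 along the first two axes
    denotes a nondetection. Returns (p_post, eta), where eta[x, y] is the
    observed joint detection probability and p_post[a - 1, b - 1, x, y] is
    the renormalised distribution over detected outcomes.
    """
    detected = p[1:, 1:, :, :]
    eta = detected.sum(axis=(0, 1))
    if np.any(eta <= 0):
        raise ValueError("no joint detections for some input pair")
    return detected / eta, eta


# Usage: uniformly random outcomes with independent losses of 50% on each side.
ideal = np.full((2, 2, 2, 2), 0.25)              # P(ab|xy) for perfect detectors
eff = 0.5                                        # hypothetical single-party efficiency
p = np.zeros((3, 3, 2, 2))
p[1:, 1:] = eff * eff * ideal                    # both detect
p[1:, 0] = eff * (1 - eff) * ideal.sum(axis=1)   # only Alice detects
p[0, 1:] = (1 - eff) * eff * ideal.sum(axis=0)   # only Bob detects
p[0, 0] = (1 - eff) ** 2                         # neither detects
p_post, eta = postselect(p)
```

With losses that are independent of the hidden strategy, as in this usage example, postselection simply returns the ideal distribution and `eta` factorises into the two single-party efficiencies.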


Similarly to local correlations, these postselected limited detection local correlations fulfil
certain conditions. More precisely, they form a convex polytope (see Appendix A) and therefore
respect a set of linear Bell-like inequalities. Making the additional assumption that
$\eta_{xy} = \eta_{x'y'}$ for all $x, x', y, y'$, one of these inequalities is, for example,
given by

$$\eta_{\min}^2\, P(00|00, a \neq 0, b \neq 0)
 - \eta_{\min}\eta_{\max} \left[ P(01|01, a \neq 0, b \neq 0) + P(10|10, a \neq 0, b \neq 0) \right]
 - \eta_{\max}^2\, P(00|11, a \neq 0, b \neq 0) \leq 0. \tag{6}$$

In an experiment with given losses, the experimenter can check for which values of
$\eta_{\min}$ and $\eta_{\max}$ the observed correlations violate this inequality. The
experimenter can then conclude that no limited detection local model with these parameters
could have reproduced them.
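Such a check can be sketched in a few lines. The function below assumes that inequality (6) takes the Hardy-type form written in its docstring; the function name and the two test distributions are illustrative choices. The PR box is used only because it is a standard nonsignalling correlation satisfying Hardy's conditions; it is not a quantum correlation.

```python
import numpy as np


def ldl_hardy_value(p_post, eta_min, eta_max):
    """Left-hand side of the Hardy-type LDL inequality, assumed to read

        eta_min^2 P(00|00) - eta_min eta_max [P(01|01) + P(10|10)]
                           - eta_max^2 P(00|11) <= 0,

    where p_post[a, b, x, y] is the postselected distribution with outcome
    labels 0 and 1. A positive return value signals a violation.
    """
    return (eta_min ** 2 * p_post[0, 0, 0, 0]
            - eta_min * eta_max * (p_post[0, 1, 0, 1] + p_post[1, 0, 1, 0])
            - eta_max ** 2 * p_post[0, 0, 1, 1])


# PR box: P(ab|xy) = 1/2 if a XOR b = x AND y, else 0.
pr = np.zeros((2, 2, 2, 2))
for a, b, x, y in np.ndindex(2, 2, 2, 2):
    if (a ^ b) == (x & y):
        pr[a, b, x, y] = 0.5

# Local deterministic strategy: both parties always output 0.
det = np.zeros((2, 2, 2, 2))
det[0, 0, :, :] = 1.0

print(ldl_hardy_value(pr, 0.01, 1.0) > 0)    # True: violated even for tiny eta_min
print(ldl_hardy_value(det, 0.01, 1.0) <= 0)  # True: no violation
```

Because the three negative terms vanish for any correlation satisfying Hardy's conditions, the sign of the expression is then decided by $P(00|00)$ alone, whatever the positive value of `eta_min`.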

Interestingly, there are quantum correlations that do not fulfil this inequality independent of
the observed detection efficiencies $\eta_{xy}$ (including the case where the detection
efficiency is different for each input pair) and for any upper bound $\eta_{\max}$, as long as
$\eta_{\min} > 0$. In fact this is achieved by all quantum correlations exhibiting Hardy’s
paradox [13] and can therefore be realised using any sufficiently pure partially entangled
2-qubit state with the right set of projective measurements. This may be quite surprising since
it is well known that without the assumption of limited detection efficiency (2), a minimal
observed detection efficiency of $\eta = \sqrt{\eta_{xy}} > \frac{2}{3}$ is required to
demonstrate nonlocality for 2 parties using binary inputs and outputs. However, making the
arbitrarily weak additional assumption that $P(a \neq 0|x\lambda) \geq \eta_{\min} > 0$ allows
one to demonstrate quantum nonlocality despite arbitrarily large losses and detection
inefficiencies.

A more general method of dealing with detection inefficiencies — It is possible to impose any
desired $\eta_{\min}$ at the price of adding some noise to the system. Assume that Alice and Bob
set their detection systems, which we assume to have an efficiency $\eta$, such that any time a
nondetection event occurs, the system still gives an outcome with probability $\eta_{\min}$. In
this way, Alice and Bob impose that their detection systems have limited detection efficiency
given by the chosen $\eta_{\min}$ and $\eta_{\max} = 1$, and they can treat the resulting
correlations by the tools presented above. This however comes at the price of adding local noise
to their correlations. In fact, assume that Alice and Bob would share the nonlocal correlation
$P_{NL}$ if the detectors were perfect and there were no losses (e.g. given by projective
measurements on a pure quantum state) and denote by $P^A_{NL}$ and $P^B_{NL}$ the marginal
distributions of Alice and Bob respectively. In the nondetection cases the detection systems are
set up such that they give with probability $\eta_{\min}$ an outcome given by the local
distributions $P^A_L$ and $P^B_L$ respectively. Then, by postselecting on the cases where the
detection systems gave an outcome, Alice and Bob share the correlation

$$P = \frac{\eta^2 P_{NL} + \eta(1-\eta)\eta_{\min}\left(P^A_{NL} P^B_L + P^A_L P^B_{NL}\right)
 + (1-\eta)^2 \eta_{\min}^2\, P^A_L P^B_L}{\left(\eta + (1-\eta)\eta_{\min}\right)^2}. \tag{7, 8}$$

They can then analyse this correlation using the tools of limited detection efficiency presented
above.
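The mixture can be written out numerically as a short sketch. It assumes the four-term structure described in the text (both detect, only Alice detects, only Bob detects, neither detects, each nondetection being replaced by a local outcome with probability $\eta_{\min}$) and a nonsignalling $P_{NL}$, so that its marginals are well defined. The function name and array layout are illustrative.

```python
import numpy as np


def partially_assigned(p_nl, p_l_a, p_l_b, eta, eta_min):
    """Postselected correlation when a fraction eta_min of nondetections is
    replaced by local outcomes.

    p_nl[a, b, x, y]: ideal (lossless) correlation, assumed nonsignalling.
    p_l_a[a, x], p_l_b[b, y]: local distributions used for the replaced events.
    eta: efficiency of each detection system.
    """
    p_nl_a = p_nl.sum(axis=1)[:, :, 0]   # Alice's marginal P(a|x)
    p_nl_b = p_nl.sum(axis=0)[:, 0, :]   # Bob's marginal P(b|y)
    fake = (1 - eta) * eta_min           # probability of a replaced outcome
    numerator = (eta ** 2 * p_nl
                 + eta * fake * np.einsum("ax,by->abxy", p_nl_a, p_l_b)
                 + fake * eta * np.einsum("ax,by->abxy", p_l_a, p_nl_b)
                 + fake ** 2 * np.einsum("ax,by->abxy", p_l_a, p_l_b))
    return numerator / (eta + fake) ** 2


# Usage with a PR box as the ideal correlation and uniform local noise.
pr = np.zeros((2, 2, 2, 2))
for a, b, x, y in np.ndindex(2, 2, 2, 2):
    if (a ^ b) == (x & y):
        pr[a, b, x, y] = 0.5
uniform = np.full((2, 2), 0.5)
mixed = partially_assigned(pr, uniform, uniform, eta=0.3, eta_min=0.2)
```

Setting `eta_min=0` returns the ideal correlation (pure postselection), while larger values trade a stronger limited detection assumption for more local noise.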

In fact, in the introduction we mentioned the possibility of dealing with losses and detector
inefficiencies by assigning the nondetection events to an additional outcome and treating the
resulting correlations using the usual tools of nonlocality. This corresponds exactly to the
strategy we just presented with $\eta_{\min} = 1$. However, our approach is more general,
allowing one to assign only a fraction of the nondetection events to an outcome and to
postselect on the rest. For a fixed detection efficiency $\eta$, our method therefore
encompasses both of the previous strategies, full postselection and full assignment to an
additional outcome, and additionally allows for an arbitrary mixture of the two. It is at this
point not obvious to us whether, for a given experiment (meaning for a given $P_{NL}$ and a
given $\eta$), all of these strategies would yield the same result. We leave it to future work
to analyse this question in more detail.

Link to measurement dependent locality (MDL) — Another way to counterfeit nonlocal correlations
using only local resources is if the common history $\lambda$ is correlated with the inputs $X$
and $Y$. If the correlation can be arbitrary, then any nonlocal correlation can be
counterfeited in this way, so limitations have to be imposed to be able to draw any
conclusions. Together with some coauthors, we recently studied the case of measurement
dependent local correlations, which are defined in the following way:

$$P(abxy) = \int d\lambda\, \rho(\lambda)\, P(xy|\lambda)\, P(a|x\lambda)\, P(b|y\lambda), \tag{9}$$

$$\ell \leq P(xy|\lambda) \leq h. \tag{10}$$

Note that if Alice and Bob each have $N$ inputs, then
$0 \leq \ell \leq \frac{1}{N^2} \leq h \leq 1$ due to the normalization of probability
distributions. Similarly to this paper, we showed that the set of MDL-correlations for fixed
$\ell$ and $h$ can be analysed using Bell-like inequalities.

It turns out that there exists a strong link between the concepts of limited detection locality
and measurement dependent locality. Indeed we make this connection explicit by the following
theorem, which we state loosely here and more explicitly in the appendix:

Theorem: Assume that we have a correlation that can be produced by using a combination of
postselected limited detection local (5) and measurement dependent local (9, 10) resources, with
parameters $(\eta_{\min}, \eta_{\max})$ and $(\ell, h)$, respectively. Then this correlation can
also be reproduced using only measurement dependent local resources with

$$\ell' = \frac{\eta_{\min}^2}{\eta_{\max}^2}\, \ell \quad \text{and} \quad
  h' = \frac{\eta_{\max}^2}{\eta_{\min}^2}\, h.$$
Intuitively, the link comes from the fact that the way for an adversary to exploit postselection
is to not answer when they do not like the input, which effectively, via postselection, amounts
to influencing the inputs. A consequence of this theorem is that whenever a correlation cannot
be reproduced by a measurement dependent local model with bounds $\ell'$ and $h'$, then it can
also not be realized using limited detection efficiencies with
$\frac{\eta_{\min}^2}{\eta_{\max}^2} \geq N^2 \ell'$ and
$\frac{\eta_{\max}^2}{\eta_{\min}^2} \leq N^2 h'$, where $N$ is the number of inputs for each of
the two parties. This allows us to take any result derived for the MDL scenario and apply it to
LDL correlations. Even more interestingly, we are now able to deal with the problems of losses
and measurement dependence in a straightforward way, since we can simply focus exclusively on
measurement dependence.
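The conversion from detection bounds to effective measurement dependence bounds is simple arithmetic. The helper below is a sketch that assumes the per-party form of the theorem, $\ell' = (\eta_{\min}/\eta_{\max})^2 \ell$ and $h' = (\eta_{\max}/\eta_{\min})^2 h$; the function name is hypothetical.

```python
def effective_mdl_bounds(eta_min, eta_max, l, h):
    """Effective MDL bounds (l', h') for postselected LDL resources.

    Assumes l' = (eta_min / eta_max)^2 * l and h' = (eta_max / eta_min)^2 * h,
    with eta_min and eta_max bounding each party's detection probability.
    """
    if not 0 < eta_min <= eta_max <= 1:
        raise ValueError("need 0 < eta_min <= eta_max <= 1")
    ratio = (eta_min / eta_max) ** 2
    return ratio * l, h / ratio


# Uniform, independent inputs with N = 2 settings per party: l = h = 1/N^2.
n_inputs = 2
l_eff, h_eff = effective_mdl_bounds(0.5, 1.0, 1 / n_inputs**2, 1 / n_inputs**2)
print(l_eff, h_eff)  # 0.0625 1.0
```

An upper bound $h' \geq 1$ carries no information, so in this sketch only the lower bound $\ell'$ remains a nontrivial constraint once $\eta_{\min}/\eta_{\max} \leq 1/N$.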

Conclusion — Losses and detection inefficiencies have been a long-lasting thorn in every
experimenter’s side. They are a big part of the reason that a loophole-free Bell test has to
this day not been conducted, while also being one of the main weak points that an adversary will
attack in any task whose security relies on quantum mechanics. To help deal with both of these
issues from a theoretical point of view, we introduced the additional assumption of limited
detection efficiency (2). The assumption at its core corresponds to assuming that the
inefficiencies in the setup are only partially exploited, an idea that we consider very
intuitive in its nature. For the case of a fundamental Bell test, assuming that nature is
non-malicious, this idea seems very natural. However, when dealing with an adversary in device
independent quantum information processing tasks, it is less obvious how to motivate the
assumption in the general scenario due to detector blinding attacks and similar measures. The
concept can be used to draw stronger conclusions in any experiment that does not fully close the
detection loophole. Its main appeal lies in the fact that even when the limitation assumption is
made arbitrarily weak, the nonlocal nature of quantum mechanics can still be revealed with
arbitrarily low overall detection efficiency.

In addition, we introduced a generalized method to deal with detection inefficiencies. The
method includes the usual methods of postselection and assignment to an additional outcome as
special cases and allows one to mix the two. It is an open question whether, for a given
experiment, one of this continuum of methods outperforms the others or whether they are all
equivalent. We leave this question for future research.

Finally, we connected the ideas of limited detection locality and measurement dependent
locality. We showed that in fact results from studying measurement dependent local correlations
can be applied to the case of limited detection locality. Moreover, it is possible to deal with
detection efficiencies and lack of measurement independence at the same time, which we hope will
be of use for future Bell experiments.

Appendix A: Polytopal structure of Limited Detection Local correlations


Consider the case where $N$ parties perform a nonlocality experiment. The input and outcome of the $i$-th party
will be denoted by $X_i$ and $A_i$ respectively. We consider the case where nondetection events can occur; they will
be denoted by $A_i$ taking the value 0. We will denote by $A'_i$ the outcome of party $i$ after postselecting on having a
detection. As discussed in the main text, we make the assumption of limited detection locality and show that these
correlations form a polytope. The theorem stated here is more general than needed for the main text, where we only
consider the case of 2 parties.

Definitions: Let $\{A_i\}_{i=1}^N$, $\{A'_i\}_{i=1}^N$, $\{X_i\}_{i=1}^N$ be sets of random variables with alphabets
$\{1 \ldots m_i, 0\}$, $\{1 \ldots m_i\}$ and $\{1 \ldots n_i\}$ respectively, $m_i, n_i \in \mathbb{N}$. In the following, the
corresponding lower case letters will denote values in the respective alphabet. We will denote probability
distributions over a random variable $V$ by $P_V$, and the value of this distribution for a given value of $V$ by $P_V(v)$.
For ease of notation, we will often omit the random variable and just write $P(v)$. We will denote conditional
probability distributions over a random variable $V$ conditioned on a random variable $W$ by $P_{V|W}$. In the case of a
continuous random variable we denote the probability density by $\rho_V$. In the following we assume that all the
probability distributions are well defined. The set of all probability distributions over $V$ will be denoted by
$\mathcal{P}_V$ and the set of all conditional probability distributions over $V$ conditioned on $W$ by $\mathcal{P}_{V|W}$.

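We define the following sets:

• The sets of 1-party distributions with limited detection:

$$\mathcal{LD}_i(\eta_{\min}, \eta_{\max}) = \left\{ P_{A_i|X_i} \in \mathcal{P}_{A_i|X_i} :
  1 - \eta_{\min} \geq P_{A_i|X_i}(0|x) \geq 1 - \eta_{\max} \ \forall x \in \{1 \ldots n_i\} \right\}.$$

• The set of $N$-party limited detection local distributions:

$$\mathcal{LDL}_N(\{\eta_{\min,i}\}_{i=1}^N, \{\eta_{\max,i}\}_{i=1}^N) = \Big\{ P_{A_1 \ldots A_N|X_1 \ldots X_N} \in \mathcal{P}_{A_1 \ldots A_N|X_1 \ldots X_N} :$$
$$\qquad P(a_1 \ldots a_N|x_1 \ldots x_N) = \int d\lambda\, \rho(\lambda) \prod_{i=1}^N P(a_i|x_i\lambda),$$
$$\qquad P_{A_i|X_i, \Lambda=\lambda} \in \mathcal{LD}_i(\eta_{\min,i}, \eta_{\max,i}) \ \forall i, \lambda \Big\}.$$

• The set of $N$-party postselected limited detection local distributions:

$$\mathcal{LDLPS}_N(\{\eta_{\min,i}\}_{i=1}^N, \{\eta_{\max,i}\}_{i=1}^N, \{\eta_{x_1 \ldots x_N}\}_{x_1 \ldots x_N}) = \Big\{ P_{A'_1 \ldots A'_N|X_1 \ldots X_N} \in \mathcal{P}_{A'_1 \ldots A'_N|X_1 \ldots X_N} :$$
$$\qquad \exists\, Q_{A_1 \ldots A_N|X_1 \ldots X_N} \in \mathcal{LDL}_N(\{\eta_{\min,i}\}_{i=1}^N, \{\eta_{\max,i}\}_{i=1}^N),$$
$$\qquad Q(a_1 \neq 0 \ldots a_N \neq 0|x_1 \ldots x_N) = \eta_{x_1 \ldots x_N},$$
$$\qquad P(a'_1 \ldots a'_N|x_1 \ldots x_N) = \frac{Q(a'_1 \ldots a'_N|x_1 \ldots x_N)}{\eta_{x_1 \ldots x_N}} \Big\}.$$

We further define the following two sets, which we will prove to contain the vertices of
$\mathcal{LD}_i(\eta_{\min}, \eta_{\max})$ and $\mathcal{LDL}_N(\{\eta_{\min,i}\}_{i=1}^N, \{\eta_{\max,i}\}_{i=1}^N)$:

$$\mathcal{V}^{\mathcal{LD}_i}(\eta_{\min}, \eta_{\max}) = \Big\{ V_{A_i|X_i} \in \mathcal{P}_{A_i|X_i} :
  \forall x \in \{1 \ldots n_i\}\ \exists\, a_x \in \{1 \ldots m_i\} \text{ and } \eta_x \in \{\eta_{\min}, \eta_{\max}\} \text{ s.t. }$$
$$\qquad V(a_x|x) = \eta_x,\ V(0|x) = 1 - \eta_x \text{ and otherwise } V(a|x) = 0 \Big\},$$

$$\mathcal{V}^{\mathcal{LDL}_N}(\{\eta_{\min,i}\}_{i=1}^N, \{\eta_{\max,i}\}_{i=1}^N) = \Big\{ V_{A_1 \ldots A_N|X_1 \ldots X_N} \in \mathcal{P}_{A_1 \ldots A_N|X_1 \ldots X_N} :$$
$$\qquad \exists\, V_i \in \mathcal{V}^{\mathcal{LD}_i}(\eta_{\min,i}, \eta_{\max,i}) \text{ s.t. }
  V(a_1 \ldots a_N|x_1 \ldots x_N) = \prod_{i=1}^N V_i(a_i|x_i) \Big\}.$$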

With these definitions, we can now state the theorem. It refers to polytopes, which for the purposes of this work
are simply seen as a convex structure with a finite set of vertices. Equivalently they can be defined by a finite set of
inequalities.

Theorem: For fixed $N$, $\{\eta_{\min,i}\}_{i=1}^N$ and $\{\eta_{\max,i}\}_{i=1}^N$,
$\mathcal{LDL}_N(\{\eta_{\min,i}\}_{i=1}^N, \{\eta_{\max,i}\}_{i=1}^N)$ is a polytope whose vertices are a subset of
$\mathcal{V}^{\mathcal{LDL}_N}(\{\eta_{\min,i}\}_{i=1}^N, \{\eta_{\max,i}\}_{i=1}^N)$. Furthermore, for fixed
$\{\eta_{x_1 \ldots x_N}\}_{x_1 \ldots x_N}$,
$\mathcal{LDLPS}_N(\{\eta_{\min,i}\}_{i=1}^N, \{\eta_{\max,i}\}_{i=1}^N, \{\eta_{x_1 \ldots x_N}\}_{x_1 \ldots x_N})$ is also a polytope.

Proof: To ease notation we will omit writing $N$, $\{\eta_{\min,i}\}_{i=1}^N$ and $\{\eta_{\max,i}\}_{i=1}^N$ from now on.
The first part of the theorem follows from the following two lemmas.

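Lemma 1: $\mathcal{LD}_i$ is a polytope with vertices given by $\mathcal{V}^{\mathcal{LD}_i}$.

Proof: Due to the normalisation of probability distributions, i.e.

$$\sum_{a_i} P(a_i|x_i) = 1 \quad \forall x_i,$$

we have that $P(0|x_i) = 1 - \sum_{a_i \neq 0} P(a_i|x_i)$ and we can therefore work in the lower dimensional subspace
given by $a_i \in \{1 \ldots m_i\}$. In this subspace, we are then left with the polytope defined by the inequalities
$\eta_{\min} \leq \sum_{a_i \neq 0} P(a_i|x_i) \leq \eta_{\max}$ together with $P(a_i|x_i) \geq 0$. For $m_i = 1$ this is
the definition of a hypercube; in general, for each $x_i$ the extremal points place the whole detection weight,
equal to either $\eta_{\min}$ or $\eta_{\max}$, on a single outcome. These are the points defined by the corresponding
part of $\mathcal{V}^{\mathcal{LD}_i}$.

The Lemma follows. □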
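The vertex set described in Lemma 1 is small enough to enumerate explicitly. The sketch below assumes the reading used above: for every input $x$ one picks a single detected outcome $a_x \in \{1 \ldots m\}$ and an efficiency $\eta_x \in \{\eta_{\min}, \eta_{\max}\}$. The function name and the dictionary representation are illustrative.

```python
from itertools import product


def ld_vertices(m, n, eta_min, eta_max):
    """Enumerate the candidate vertices of the 1-party limited detection set.

    m: number of detected outcomes (labelled 1..m; 0 is a nondetection).
    n: number of inputs (labelled 1..n).
    Each vertex is a dict v[(a, x)] = V(a|x).
    """
    per_input = list(product(range(1, m + 1), (eta_min, eta_max)))
    vertices = []
    for choice in product(per_input, repeat=n):
        v = {(a, x): 0.0 for a in range(m + 1) for x in range(1, n + 1)}
        for x, (a_x, eta_x) in enumerate(choice, start=1):
            v[(a_x, x)] = eta_x        # all detection weight on one outcome
            v[(0, x)] = 1.0 - eta_x    # remaining weight is a nondetection
        vertices.append(v)
    return vertices


verts = ld_vertices(m=2, n=2, eta_min=0.2, eta_max=0.9)
print(len(verts))  # 16 = (2 * m) ** n
```

The count $(2m)^n$ holds for $\eta_{\min} < \eta_{\max}$; if the two bounds coincide, the enumeration contains duplicates.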

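Lemma 2: Let $\mathcal{Q}$ and $\mathcal{R}$ be polytopes with vertices $\mathcal{V}_Q$ and $\mathcal{V}_R$ respectively. Let
$\mathcal{S} = \left\{ S : \exists \Lambda \text{ s.t. } S(u,v) = \int d\lambda\, \rho(\lambda)\, Q_\lambda(u) R_\lambda(v) \text{ with } Q_\lambda \in \mathcal{Q}, R_\lambda \in \mathcal{R} \right\}$.
Let $\mathcal{V}_S = \left\{ V_S : V_S(u,v) = V_Q(u) V_R(v) \text{ with } V_Q \in \mathcal{V}_Q, V_R \in \mathcal{V}_R \right\}$.
Then $\mathcal{S}$ is a polytope whose vertices are a subset of $\mathcal{V}_S$.

Proof: By definition, $\mathcal{S}$ is convex and $\mathcal{V}_S \subset \mathcal{S}$.
Let $S \in \mathcal{S}$, then by definition we have:

$$S(u,v) = \int d\lambda\, \rho(\lambda)\, Q_\lambda(u) R_\lambda(v)$$
$$= \int d\lambda\, \rho(\lambda) \sum_i q_{\lambda,i} V_Q^i(u) \sum_j r_{\lambda,j} V_R^j(v)$$
$$= \sum_{(i,j)} \left( \int d\lambda\, \rho(\lambda)\, q_{\lambda,i}\, r_{\lambda,j} \right) V_Q^i(u) V_R^j(v)$$
$$= \sum_{(i,j)} s_{ij} V_S^{ij}(u,v).$$

In the first step we use the fact that any element of $\mathcal{Q}$ and $\mathcal{R}$ can be written as a convex combination of their vertices
and we define $q_{\lambda,i} \geq 0$, $\sum_i q_{\lambda,i} = 1$ and $r_{\lambda,j} \geq 0$, $\sum_j r_{\lambda,j} = 1$. In the last step we defined
$s_{ij} = \int d\lambda\, \rho(\lambda)\, q_{\lambda,i}\, r_{\lambda,j}$, which fulfils $s_{ij} \geq 0$ and $\sum_{ij} s_{ij} = 1$, and also used the definition of $\mathcal{V}_S$.

This proves the Lemma. □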

Using these two Lemmas in conjunction (and using Lemma 2 iteratively) proves that $\mathcal{LDL}$ is a polytope and that
$\mathcal{V}^{\mathcal{LDL}}$ contains its vertices.

To finalize the proof of the theorem we need to show that $\mathcal{LDLPS}$ is a polytope as well. This can be seen directly
since the set is obtained by slicing $\mathcal{LDL}$ with the hyperplanes defined by
$P(a_1 \neq 0 \ldots a_N \neq 0|x_1 \ldots x_N) = \eta_{x_1 \ldots x_N}$.
Cutting a polytope with hyperplanes results in another polytope. The final step is a simple rescaling of the entries
(equivalent to rescaling the axes) and therefore the set remains a polytope.

This proves the theorem. □


Appendix B: Limited Detection Locality and Measurement Dependent Locality 

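In this section we prove the link between limited detection local and measurement dependent local distributions.
This can be proven more generally; here we only present the 2-party version.

Definitions: We introduce the random variables $D_A$ and $D_B$ with alphabet $\{0, 1\}$ such that $D_A = 0$ if and only if
$A = 0$, and analogously for $D_B$ and $B$. We define the set of limited detection local distributions allowing for
measurement dependence:

$$\mathcal{MDLDL}(\ell, h, \eta_{\min}, \eta_{\max}) = \Big\{ P_{A D_A B D_B X Y} :
  P(a d_A b d_B x y) = \int d\lambda\, \rho(\lambda)\, P(xy|\lambda)\, P(a d_A b d_B|xy\lambda),$$
$$\qquad \eta_{\min} \leq P_{D_A D_B|XY\Lambda}(11|xy\lambda) \leq \eta_{\max},$$
$$\qquad P(a d_A b d_B|xy\lambda) = P(a d_A|x\lambda)\, P(b d_B|y\lambda),$$
$$\qquad \ell \leq P(xy|\lambda) \leq h,$$
$$\qquad \int d\lambda\, \rho(\lambda) = 1,\ \rho(\lambda) \geq 0 \Big\}.$$

We also define the set of measurement dependent local correlations:

$$\mathcal{MDL}(\ell, h) = \Big\{ P_{ABXY} :
  P(abxy) = \int d\lambda\, \rho(\lambda)\, P(xy|\lambda)\, P(ab|xy\lambda),$$
$$\qquad P(ab|xy\lambda) = P(a|x\lambda)\, P(b|y\lambda),$$
$$\qquad \ell \leq P(xy|\lambda) \leq h,$$
$$\qquad \int d\lambda\, \rho(\lambda) = 1,\ \rho(\lambda) \geq 0 \Big\}.$$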

Theorem: If

$$P_{A D_A B D_B X Y} \in \mathcal{MDLDL}(\ell, h, \eta_{\min}, \eta_{\max})$$

then

$$P_{ABXY|D_A=1, D_B=1} \in \mathcal{MDL}\left( \frac{\eta_{\min}}{\eta_{\max}}\, \ell,\ \frac{\eta_{\max}}{\eta_{\min}}\, h \right).$$


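Proof: We have

1. $P(a d_A b d_B x y) = \int d\lambda\, \rho(\lambda)\, P(xy|\lambda)\, P(a d_A b d_B|xy\lambda)$

2. $P(a d_A b d_B|xy\lambda) = P(a d_A|x\lambda)\, P(b d_B|y\lambda)$

3. $\eta_{\min} \leq P_{D_A D_B|XY\Lambda}(11|xy\lambda) \leq \eta_{\max}$

4. $\ell \leq P(xy|\lambda) \leq h$.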

Let us prove a few implications:

• If $(A D_A|X)$ and $(B D_B|Y)$ are local, meaning that they fulfil condition 2, then $(D_A|X)$ and $(D_B|Y)$ are also
local:

$$P(d_A d_B|xy\lambda) = \sum_{a,b} P(a d_A b d_B|xy\lambda)$$
$$= \sum_{a,b} P(a d_A|x\lambda)\, P(b d_B|y\lambda)$$
$$= P(d_A|x\lambda)\, P(d_B|y\lambda).$$

• If $(A D_A|X)$ and $(B D_B|Y)$ are local, then $(A|D_A X)$ and $(B|D_B Y)$ are also local:

$$P(ab|x d_A y d_B \lambda) = \frac{P(a d_A b d_B|xy\lambda)}{P(d_A d_B|xy\lambda)}$$
$$= \frac{P(a d_A|x\lambda)}{P(d_A|x\lambda)}\, \frac{P(b d_B|y\lambda)}{P(d_B|y\lambda)}$$
$$= P(a|x d_A \lambda)\, P(b|y d_B \lambda).$$

• Knowing less cannot result in knowing more, meaning that upper and lower bounds on $P(\mu|\nu\sigma)$ also hold for
$P(\mu|\nu)$. Assume $P(\mu|\nu\sigma) \leq h$, then

$$P(\mu|\nu) = \sum_\sigma P(\sigma|\nu)\, P(\mu|\nu\sigma)$$
$$\leq h \sum_\sigma P(\sigma|\nu)$$
$$= h,$$

where we used that $\sum_\sigma P(\sigma|\nu) = 1$. The same holds for lower bounds $\ell \leq P(\mu|\nu\sigma)$. Due to this, condition 3
implies

$$\eta_{\min} \leq P_{D_A D_B|\Lambda}(11|\lambda) \leq \eta_{\max}.$$


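Using the implications above, we can show that the conditions imply bounds on $P(xy|D_A = 1, D_B = 1, \lambda)$:

$$P(xy|D_A = 1, D_B = 1, \lambda) =
  \frac{\overbrace{P(D_A = 1, D_B = 1|xy\lambda)}^{\eta_{\min} \leq \cdots \leq \eta_{\max}}}
       {\underbrace{P(D_A = 1, D_B = 1|\lambda)}_{\eta_{\min} \leq \cdots \leq \eta_{\max}}}\;
  \underbrace{P(xy|\lambda)}_{\ell \leq \cdots \leq h}$$

$$\Rightarrow\ \frac{\eta_{\min}}{\eta_{\max}}\, \ell \leq P(xy|D_A = 1, D_B = 1, \lambda) \leq \frac{\eta_{\max}}{\eta_{\min}}\, h.$$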
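The bound can be checked numerically for a single hidden strategy $\lambda$. The sketch below draws a random input distribution $P(xy|\lambda)$ and random joint detection probabilities within $[\eta_{\min}, \eta_{\max}]$, applies Bayes' rule, and compares the result with the two bounds. The function name, the seed and the parameter values are arbitrary choices for illustration.

```python
import numpy as np


def postselected_input_distribution(p_xy, p_det):
    """Bayes' rule for one fixed hidden strategy lam.

    p_xy[k]:  P(xy|lam) for the k-th input pair.
    p_det[k]: P(D_A = 1, D_B = 1 | xy, lam) for the k-th input pair.
    Returns P(xy | D_A = 1, D_B = 1, lam).
    """
    joint = p_det * p_xy
    return joint / joint.sum()  # joint.sum() equals P(D_A = 1, D_B = 1 | lam)


rng = np.random.default_rng(0)
eta_min, eta_max = 0.2, 0.9
for _ in range(1000):
    p_xy = rng.dirichlet(np.ones(4))                 # 2 x 2 input pairs
    p_det = rng.uniform(eta_min, eta_max, size=4)
    post = postselected_input_distribution(p_xy, p_det)
    l, h = p_xy.min(), p_xy.max()                    # tightest (l, h) for this lam
    assert np.all(post >= eta_min / eta_max * l - 1e-12)
    assert np.all(post <= eta_max / eta_min * h + 1e-12)
```

The assertions hold for every sample because the numerator is bounded by $[\eta_{\min}\ell, \eta_{\max}h]$ and the normalisation by $[\eta_{\min}, \eta_{\max}]$, exactly as in the derivation.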



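We can now prove the theorem:

$$P(abxy|D_A = 1, D_B = 1) = \int d\lambda\, \rho(\lambda|D_A = 1, D_B = 1)\, P(xy|\lambda, D_A = 1, D_B = 1)\, P(ab|xy\lambda, D_A = 1, D_B = 1)$$

$$= \int d\lambda\, \rho(\lambda|D_A = 1, D_B = 1)\,
  \underbrace{P(xy|\lambda, D_A = 1, D_B = 1)}_{\frac{\eta_{\min}}{\eta_{\max}} \ell\, \leq \cdots \leq\, \frac{\eta_{\max}}{\eta_{\min}} h}\,
  P(a|x\lambda, D_A = 1)\, P(b|y\lambda, D_B = 1).$$

This is by definition an MDL-correlation:

$$P_{ABXY|D_A = 1, D_B = 1} \in \mathcal{MDL}\left( \frac{\eta_{\min}}{\eta_{\max}}\, \ell,\ \frac{\eta_{\max}}{\eta_{\min}}\, h \right).$$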



